Modeling prosodic sequences with k-means and dirichlet process GMMs

نویسنده

  • Andrew Rosenberg
چکیده

In this paper we describe two unsupervised representations of prosodic sequences based on k-means and Dirichlet Process Gaussian Mixture Model (DPGMM) clustering. The clustering algorithms are used to infer an inventory of prosodic categories over automatically segmented syllables. A tri-gram model is trained over these sequences to characterize speech. We find that DPGMM clusters show a greater correspondence with manual ToBI labels than k-means clusters. However, sequence models trained on k-means clusters significantly outperform DPGMM sequences in classifying speaking style, nativeness and speakers. We also investigate the use of these sequence models in the detection of outliers regarding these three tasks. Non-parametric Bayesian techniques have the advantage of being able to learn a clustering solution and infer the number of clusters directly from data. While it is attractive to avoid specifying k before clustering, on the tasks of characterizing prosodic sequences we find that effective use of DPGMMs still requires a significant amount of parameter tuning, and performance fails to reach the level of k-means.

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

Pitch-dependent GMMs for text-independent speaker recognition systems

Gaussian mixture models (GMMs) and ergodic hidden Markov models (HMMs) have been successfully applied to model short-term acoustic vectors for speaker recognition systems. Prosodic features are known to carry information concerning the speaker’s identity and they can be combined with the short-term acoustic vectors in order to increase the performance of the speaker recognition system. In this ...

متن کامل

Learning When to Listen: Detecting System-Addressed Speech in Human-Human-Computer Dialog

New challenges arise for addressee detection when multiple people interact jointly with a spoken dialog system using unconstrained natural language. We study the problem of discriminating computer-directed from human-directed speech in a new corpus of human-human-computer (H-H-C) dialog, using lexical and prosodic features. The prosodic features use no word, context, or speaker information. Res...

متن کامل

Mon.O2b.04 Learning When to Listen: Detecting System-Addressed Speech in Human-Human-Computer Dialog

New challenges arise for addressee detection when multiple people interact jointly with a spoken dialog system using unconstrained natural language. We study the problem of discriminating computer-directed from human-directed speech in a new corpus of human-human-computer (H-H-C) dialog, using lexical and prosodic features. The prosodic features use no word, context, or speaker information. Res...

متن کامل

Effects of compatible versus competing rhythmic grouping on errors and timing variability in speech.

In typical speech words are grouped into prosodic constituents. This study investigates how such grouping interacts with segmental sequencing patterns in the production of repetitive word sequences. We experimentally manipulated grouping behavior using a rhythmic repetition task to elicit speech for perceptual and acoustic analysis to test the hypothesis that prosodic structure and patterns of ...

متن کامل

Revisiting k-means: New Algorithms via Bayesian Nonparametrics

Bayesian models offer great flexibility for clustering applications—Bayesian nonparametrics can be used for modeling infinite mixtures, and hierarchical Bayesian models can be utilized for sharing clusters across multiple data sets. For the most part, such flexibility is lacking in classical clustering methods such as k-means. In this paper, we revisit the k-means clustering algorithm from a Ba...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

عنوان ژورنال:

دوره   شماره 

صفحات  -

تاریخ انتشار 2013